Search results for "Historical document"
An Efficient Cooperative Smearing Technique for Degraded Historical Documents Images Segmentation
2020
Segmentation is one of the critical steps in historical document image analysis systems; it determines the quality of the search, understanding, recognition and interpretation processes. It isolates the objects to be considered and separates the regions of interest (paragraphs, lines, words and characters) from other entities (figures, graphs, tables, etc.). This stage follows thresholding, which aims to improve the quality of the document and to separate its foreground from its background, and skew detection and correction, which straightens the document. Here, a hybrid method is proposed in order to locate words and characters in both handwritten and printed docu…
ICDAR 2021 Competition on Historical Document Classification
2021
This competition investigated the performance of historical document classification. The analysis of historical documents is a difficult challenge commonly solved by trained humanists. We provided three different classification tasks, which can be solved individually or jointly: font group/script type, location, date. The document images are provided by several institutions and are taken from handwritten and printed books as well as from charters. In contrast to previous competitions, all participants relied upon deep-learning-based approaches. Nevertheless, we saw wide variation in performance among the submitted systems. The easiest task seemed to be font grou…
Reducing the Human Effort in Text Line Segmentation for Historical Documents
2021
Labeling the layout in historical documents to prepare training data for machine learning techniques is an arduous task that requires great human effort. A draft of the layout can be obtained by using a document layout analysis (DLA) system; the user can then correct it with less effort than doing it from scratch. In this paper, we study an iterative process in which the user only supervises and corrects the given draft for the pages automatically selected by the DLA system, with the aim of reducing the required human effort. The results obtained show that similar DLA quality can be achieved by reducing the number of pages that the user has to annotate and that the accumulated h…
Writer identification for historical handwritten documents using a single feature extraction method
2020
With the growth of artificial intelligence techniques, the problem of writer identification from historical documents has gained increased interest. It consists of determining the identity of the writers of these documents. This paper introduces our baseline system for writer identification, tested on a large dataset of Latin historical manuscripts used in the ICDAR 2019 competition. The proposed system yielded the best results using Scale Invariant Feature Transform (SIFT) as a single feature extraction method, without any preprocessing stage. The system was compared against four teams who participated in the competition with different feature extraction methods: SRS-LBP, SI…
A Robust Multi Stage Technique for Image Binarization of Degraded Historical Documents
2017
Document image binarization is a central problem in many document analysis systems. Indeed, it represents one of the basic challenges, especially in the case of historical document analysis. In this paper, we propose a novel robust multi-stage framework that combines different existing document image thresholding methods for the purpose of getting a better binarization result. The CLAHE technique is introduced to significantly enhance contrast in some poor images. The proposed method then uses a hybrid algorithm to partition the image into foreground and background. A special procedure is finally applied in order to remove small noise and correct character morphology. Experime…
Towards semantic modelling of cultural historical data.
2010
In this paper, a practical method is presented for creating documentation of cultural historical targets using an event-centric core ontology. By using semantic documentation templates and an XML-based query language, a domain-specific documentation model can be created and flexible user interfaces can be built easily for accessing and editing the documentation. Keywords: ontologies, cultural historical documentation, information retrieval